Do not parse URIs during LSP serialization/deserialization #76691
/// TODO: document.
/// Converts the LSP spec URI string into our custom wrapper for URI strings.
/// We do not convert directly to <see cref="System.Uri"/> as it is unable to handle
/// certain valid RFC spec URIs. We do not want serialization / deserialization to fail if we cannot parse the URI.
consider linking to the runtime issue.
// Valid URI, but System.Uri cannot parse it.
[InlineData(true, "perforce://@=1454483/some/file/here/source.cs")]
[InlineData(false, "perforce://@=1454483/some/file/here/source.cs")]
public async Task TestOpenDocumentWithInvalidUri(bool mutatingLspWorkspace, string uriString) |
new tests for these URIs
// If either of the URIs cannot be parsed, we'll compare the original URI strings.
if (otherUri.ParsedUri is null || this.ParsedUri is null)
{
return this.UriString == otherUri.UriString; |
this must be false, right? because we did this exact check above, and would have returned 'true' if it succeeded. so why not return false here?
yup, that should definitely return false. good catch
/// In order to gracefully handle these issues, we defer the parsing of the URI until someone
/// actually asks for it (and can handle the failure).
/// </remarks>
internal sealed class DocumentUri : IEquatable<DocumentUri> |
the main new wrapper type
/// </summary>
internal class DocumentUriConverter : JsonConverter<Uri>
internal class DocumentUriConverter : JsonConverter<DocumentUri> |
new serializer for new wrapper type
private class LspUriComparer : IEqualityComparer<Uri>
{
public static readonly LspUriComparer Instance = new();
public bool Equals(Uri? x, Uri? y) |
moved to the new type
@@ -112,7 +62,7 @@ public int GetHashCode(Uri obj)
/// the URI.
/// <para/> Access to this is guaranteed to be serial by the <see cref="RequestExecutionQueue{RequestContextType}"/>
/// </summary>
private ImmutableDictionary<Uri, (SourceText Text, string LanguageId)> _trackedDocuments = ImmutableDictionary<Uri, (SourceText, string)>.Empty.WithComparers(LspUriComparer.Instance);
private ImmutableDictionary<DocumentUri, (SourceText Text, string LanguageId)> _trackedDocuments = ImmutableDictionary<DocumentUri, (SourceText SourceText, string LanguageId)>.Empty; |
rest of this file is interesting - contains changes to attempt to parse and handle issues
namespace Microsoft.CodeAnalysis.ExternalAccess.Razor;

internal static class RazorUri
{
[Obsolete("Use RazorUri.GetUriFromFilePath instead")] |
Nit: Shouldn't this tell people to use CreateAbsoluteDocumentUri?
Although, if changing to DocumentUri is a breaking change for Razor, is there a reason to keep this obsolete at all?
So changing to DocumentUri doesn't seem to break non-cohosted Razor. I'm not as sure about this one.
if (this.ParsedUri is null)
{
// We can't do anything better than the uri string hash code if we cannot parse the URI.
return this.UriString.GetHashCode(); |
I don't think I understand why GetHashCode shouldn't only look at UriString, and let the Equals method do the rest.
It's perhaps not a real world problem so take this with a grain of salt, but I think GetHashCode looking at the parsed Uri means a dictionary/hashset operation can have different results for the same Uri, depending on whether someone happens to have called ParsedUri yet. If GetHashCode is simpler, then things would collide, and the Equals method would save the day.
Potentially broken example (untested, obv):
var uri = "http://goo";
var documentUri1 = new DocumentUri(uri);
var documentUri2 = new DocumentUri(uri);
documentUri2.GetRequiredParsedUri();
var dict = new Dictionary<DocumentUri, object>();
dict.Add(documentUri1, new GiantObject());
Assert.True(dict.ContainsKey(documentUri2));
Accessing ParsedUri will cause the URI to attempt to be parsed (which GetHashCode does). So 1 and 2 should have the same hash code (but I will verify as well).
"If GetHashCode is simpler, then things would collide, and the Equals method would save the day."
In this case, the issue is two different URI strings (with different string hashcodes) should be considered equal. For example an encoded vs. unencoded URI that point to the same file path should be considered equal (as System.Uri does).
There'd be no collision here because the hashcodes would be different, hence the need to look at both.
Ahh I think maybe I read the null check on ParsedUri and forgot that would mean it would cause a parse.
As for the pause here - I need to update XAML with these changes, but have been working on infra. Will get back to this shortly
Resolves dotnet/vscode-csharp#7638
Related to dotnet/runtime#64707
Problem
System.Uri will throw attempting to parse URIs that have sub-delims in the host name. This is because it does additional host name validation beyond what is defined in the URI RFC spec. See above linked issues for specific examples.
When we get passed these URIs from VSCode, the server crashes because we cannot successfully deserialize the LSP URI string into a System.Uri. We are unable to easily recover from these as it happens before we get to the queue processing.
Solution
The approach I took in this PR is to stop parsing URIs during LSP message deserialization and remove usages of System.Uri from LSP directly. Instead, I created a wrapper type DocumentUri which initially stores just the string representation of the URI (exactly how LSP defines URIs). The System.Uri can be optionally retrieved from this (which will parse it). Only places which need to extract information from the URI should retrieve the System.Uri and must handle failures.
This allows URI parsing to be delayed until much later (for example looking at the scheme or file path to find a matching document in the workspace). While we still cannot parse the URI, we can handle the error and provide a misc document instead of crashing the server. These aren't normal files anyway (they don't conform to the file:///<path> format) and would go to misc regardless of if we succeed in parsing it or not.
If at some point the runtime adds a new parsing mode, or we switch to our own URI parser (similar to O#), this still provides value. Deserialization of the URI string and parsing the URI string are two separate logical operations and should not be tied together. Additionally, any runtime improvements here would only be available in .NET 10.
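For readers skimming the description, here is a rough, simplified sketch of the deferred-parsing idea. It is not the code in this PR: the type name LazyDocumentUri, the use of Lazy<Uri?> with Uri.TryCreate, and the Demo class are assumptions made only for illustration. The equality and hash code logic follows the behavior discussed in the review thread (string match first, not equal if either side cannot be parsed, otherwise compare the parsed URIs).

```csharp
#nullable enable
using System;

// Hypothetical illustration only; names and structure are assumptions, not the PR's implementation.
internal sealed class LazyDocumentUri : IEquatable<LazyDocumentUri>
{
    private readonly Lazy<Uri?> _parsed;

    public LazyDocumentUri(string uriString)
    {
        UriString = uriString;
        // Nothing is parsed here; the work happens the first time ParsedUri is read.
        _parsed = new Lazy<Uri?>(() =>
            Uri.TryCreate(uriString, UriKind.Absolute, out var uri) ? uri : null);
    }

    // The URI exactly as it arrived in the LSP message.
    public string UriString { get; }

    // Null when System.Uri cannot parse the string; callers must handle that case.
    public Uri? ParsedUri => _parsed.Value;

    public bool Equals(LazyDocumentUri? other)
    {
        if (other is null) return false;
        if (UriString == other.UriString) return true;
        // The strings differ and at least one side cannot be parsed, so treat them as not equal.
        if (ParsedUri is null || other.ParsedUri is null) return false;
        return ParsedUri.Equals(other.ParsedUri);
    }

    public override bool Equals(object? obj) => Equals(obj as LazyDocumentUri);

    // Uses the parsed URI when available so that different strings that parse to equal
    // URIs land in the same bucket; falls back to the raw string otherwise.
    public override int GetHashCode()
        => ParsedUri is null ? UriString.GetHashCode() : ParsedUri.GetHashCode();
}

internal static class Demo
{
    private static void Main()
    {
        var doc = new LazyDocumentUri("perforce://@=1454483/some/file/here/source.cs");
        if (doc.ParsedUri is { } uri)
            Console.WriteLine($"Parsed, scheme: {uri.Scheme}");
        else
            Console.WriteLine($"Could not parse; falling back to the raw string: {doc.UriString}");
    }
}
```

The point of the sketch is that constructing the wrapper never fails, so deserialization cannot crash on a URI that System.Uri rejects; only the code that actually asks for ParsedUri has to decide what to do when it is null.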
Key areas to look at
DocumentUri
DocumentUriConverter
LspWorkspaceManager
ProtocolConversions
Razor PR - https://github.com/dotnet/razor/pull/11390/files